Abstract

We consider in this paper a Gaussian sequence model of observations $Y_i$, $i \ge 1$, having mean (or signal) $\theta_i$ and variance $\sigma_i^2$ which grows polynomially like $i^{2\gamma}$, $\gamma > 0$. This model describes a large panel of inverse problems. We estimate the quadratic functional of the unknown signal $\sum_{i \ge 1} \theta_i^2$ when the signal belongs to ellipsoids of both finite smoothness functions (polynomial weights $i^{\alpha}$, $\alpha > 0$) and infinite smoothness (exponential weights $e^{\beta i^r}$, $\beta > 0$, $0 < r \le 2$). We propose a Pinsker-type projection estimator in each case and study its quadratic risk. When the signal is sufficiently smoother than the difficulty of the inverse problem ($\alpha > \gamma + 1/4$ or in the case of exponential weights), we obtain the parametric rate and the efficiency constant associated with it. Moreover, we give upper bounds of the second order term in the risk and conjecture that they are asymptotically sharp minimax. When the signal is finitely smooth with $\alpha < \gamma + 1/4$, we compute nonparametric upper bounds of the risk and we presume also that the constant is asymptotically sharp.

1 Introduction



We observe $\{Y_i\}_{i = 1, \dots, n}$:

$$Y_i = \theta_i + \varepsilon \xi_i, \quad i = 1, \dots, n, \tag{1}$$

where $\xi_i$ are independent random variables, having a Gaussian law with zero mean and variance $\sigma_i^2 = i^{2\gamma}$ for some fixed $\gamma > 0$. Let us mention that in case $\{\sigma_i\}_{i \ge 1}$ is a bounded sequence the problem is direct and when $\sigma_i \to \infty$ the problem is an inverse problem. We say that the problem is ill-posed when $\sigma_i$ increases polynomially and severely ill-posed when it increases exponentially.

We want to estimate the quadratic functional $Q(\theta) = \sum_{i \ge 1} \theta_i^2$, where $\theta = \{\theta_i\}_{i \ge 1}$ belongs to the $\ell_2$-ellipsoid

$$\Sigma = \Sigma(a, L) = \Big\{ \theta : \sum_{i \ge 1} a_i^2 \theta_i^2 \le L \Big\}, \tag{2}$$

where $a_i$ is a non-decreasing sequence of positive real numbers and $L > 0$. We consider both the polynomial sequence $a_i = i^{\alpha}$, where we say that the signal is (ordinary) smooth, and the exponential sequence $a_i = \exp(\beta i^r)$, where we say that the signal is super-smooth, with $\alpha, \beta > 0$ and $0 < r \le 2$.

It is known that this model can be deduced from a linear operator equation with noisy observations $Y = Ax + \varepsilon \xi$, where $A : \mathcal{H} \to \mathcal{H}$ is a known linear operator on the Hilbert space $\mathcal{H}$, $x \in \mathcal{H}$ is the signal of interest and $\xi$ is a standard white Gaussian noise. By considering an orthonormal basis $\{\varphi_i\}_{i \ge 1}$ of $\mathcal{H}$, we consider only the sequence of values $Y_i := Y(\varphi_i)/b_i$, where $b_i^2$ are the eigenvalues of $AA^*$ for $i \ge 1$. For more details and examples of inverse problems that can be written in the form (1) we refer the reader to Cavalier et al. [1], [5] and references therein. We mention as particular examples the convolution operator, the Radon transform in the case of tomography, and problems described by partial differential equations.

Estimation of $\theta$ in the inverse problem (1) with a quadratic risk was thoroughly studied in the literature from the minimax point of view. Let us only mention a few minimax adaptive results: oracle inequalities in Cavalier et al. [5], sharp adaptive estimation by block thresholding in Cavalier et al. [3] and adaptive estimators defined by penalized empirical risk in Golubev [9].

Estimation of quadratic functionals in inverse problems was studied in two particular problems (specified operators). Butucea [1] considered the convolution density model and studied the rates of a kernel type estimator. Meziani [11] estimates the purity of a quantum state, which corresponds mathematically to a quadratic functional of a bivariate function of mass 1, in a double inverse problem: tomography and convolution with Gaussian noise. Our model allows us to consider more general inverse problems, i.e. various operators $A$.

Quadratic functionals have been studied much more in the direct problem ($\sigma_j$ bounded for all $j$), since the first results given by Ibragimov and Has'minskii [10] and Ibragimov et al. Fan [8] gave minimax rates over hyperrectangles and Sobolev-type ellipsoids. Donoho and Nussbaum [6] gave Pinsker sharp minimax estimators in this model and in the equivalent models of fixed equidistant design regression and Gaussian white noise model. For more general bodies which are not quadratically convex, Cai and Low [2] showed that nonquadratic estimators attain the minimax rate of the quadratic functional. For adaptive estimators over hyperrectangles we cite Efromovich and Low [?]. Sharp or nearly sharp adaptive estimators over $\ell_p$-bodies were found by Klemelä [12]. Adaptive estimators over more general Besov and $\ell_p$ bodies were given by Cai and Low [3]. In the density model, let us mention adaptive estimators via model selection by Laurent [13].

Let us underline the difference between estimating $Q(\theta)$ in our model and that of estimating from direct data $\sum_{j \ge 1} j^{2\gamma} \theta_j^2$ for $\gamma \in \mathbb{N}$ as it was done, e.g., by Fan [8], Donoho and Nussbaum [6] and Klemelä [12]. In our case, the variance of our estimators is slower. When estimating the quadratic functional of a derivative, the bias is smaller, so the rates and constants are different.

Here, we give a Pinsker-type projection estimator which automatically attains the parametric rate and the efficiency constant for all super-smooth signals and for the smooth signals when $\alpha > \gamma + 1/4$. Moreover, in this case we give nonparametric minimax upper bounds of the second order term in the quadratic risk. Our estimator attains the expected minimax nonparametric rate in the case of smooth signals with $\alpha < \gamma + 1/4$. We conjecture that the asymptotic constant in the nonparametric upper bound of the risk is sharp. The proofs of sharp lower bounds will be the object of future work.

Let us mention that our method can be easily adapted for severely ill-posed inverse problems, i.e. when $\sigma_i$ increases exponentially. The case where $\sigma_i = e^{i}$ is of particular interest in practice and has not been studied for estimating the signal $\{\theta_i\}_{i \ge 1}$ either. Future developments should concern adaptive estimation of the quadratic functional.

In Section 2 we describe the estimator and the precise choice of tuning parameters, and give asymptotic upper bounds on the rates of convergence together with the associated constants. We postpone the proofs to Section 3 and the Appendix.

2 Estimation procedure and results



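Let us define the estimator

$$\hat{Q} = \sum_{i \ge 1} h_i \big( Y_i^2 - \varepsilon^2 \sigma_i^2 \big), \tag{3}$$

where $\{h_i\}_{i \ge 1}$ is a sequence between 0 and 1. We shall actually see that the optimal sequence is truncated, i.e. $h_i = 0$ for all $i > W$, and that the optimal value of $W$ tends to infinity when $\varepsilon \to 0$.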

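Let us first consider the case of smooth signal: $\theta \in \Sigma(\alpha, L)$, where $a_i = i^{\alpha}$.

Theorem 1 Let observations $Y_1, \dots, Y_n, \dots$ satisfy model (1). Then the estimator $\hat{Q}$ in (3) with parameters $\{h_i\}_{i \ge 1}$ and $W$ defined by

$$h_i = \Big( 1 - \Big( \frac{i}{W} \Big)^{2\alpha} \Big)_+ \quad \text{and} \quad W = \Big( \frac{L^2 (4\gamma + 4\alpha + 1)(4\gamma + 2\alpha + 1)}{4\alpha} \Big)^{\frac{1}{4\alpha + 4\gamma + 1}} \varepsilon^{-\frac{4}{4\alpha + 4\gamma + 1}}$$

is such that

$$\sup_{\theta \in \Sigma(\alpha, L)} E\big[ (\hat{Q} - Q(\theta))^2 \big] \le C(\alpha, \gamma, L)\, \varepsilon^{\frac{16\alpha}{4\alpha + 4\gamma + 1}} (1 + o(1)),$$

if $\alpha < \gamma + 1/4$, and

$$\sup_{\theta \in \Sigma(\alpha, L)} \Big\{ E\big[ (\hat{Q} - Q(\theta))^2 \big] - 4 \varepsilon^2 \sum_{i \ge 1} \sigma_i^2 \theta_i^2 \Big\} \le C(\alpha, \gamma, L)\, \varepsilon^{\frac{16\alpha}{4\alpha + 4\gamma + 1}} (1 + o(1)),$$

if $\alpha > \gamma + 1/4$, where

$$C(\alpha, \gamma, L) = L^{\frac{2(4\gamma + 1)}{4\alpha + 4\gamma + 1}} \Big( \frac{2\alpha + 4\gamma + 1}{4\alpha} \Big)^{-\frac{4\alpha}{4\alpha + 4\gamma + 1}} \frac{(4\alpha + 4\gamma + 1)^{\frac{4\gamma + 1}{4\alpha + 4\gamma + 1}}}{4\gamma + 1}. \tag{4}$$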



We find a known phenomenon in the quadratic functional estimation literature, i.e. the existence of two cases: a regular one, where the rate is parametric, $\varepsilon^2$, and an irregular case where the rate is significantly slower. We conjecture that Theorem 1 exhibits the sharp asymptotic constant in this last case.
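To make the estimator (3) and the tuning of Theorem 1 concrete, here is a minimal simulation sketch. The numerical values ($\alpha = 2$, $\gamma = 0.5$, $L = 1$, $\varepsilon = 0.01$), the number of observed coordinates `n`, the test signal $\theta_i \propto i^{-\alpha - 1}$ and all function names are illustrative assumptions, not part of the paper; $W$ is kept real-valued here, whereas the proof takes its integer part.

```python
import math
import random

def theorem1_bandwidth(eps, alpha, gamma, L):
    """Real-valued W from Theorem 1 (the proof uses its integer part)."""
    d = 4 * alpha + 4 * gamma + 1
    b = (4 * gamma + 4 * alpha + 1) * (4 * gamma + 2 * alpha + 1) / (4 * alpha)
    return (L ** 2 * b) ** (1 / d) * eps ** (-4 / d)

def weights(W, alpha, n):
    """h_i = (1 - (i/W)^(2 alpha))_+ for i = 1, ..., n."""
    return [max(0.0, 1 - (i / W) ** (2 * alpha)) for i in range(1, n + 1)]

def simulate(theta, eps, gamma, rng):
    """Draw Y_i = theta_i + eps * xi_i with xi_i ~ N(0, i^(2 gamma))."""
    return [t + eps * rng.gauss(0.0, i ** gamma)
            for i, t in enumerate(theta, start=1)]

def q_hat(y, h, eps, gamma):
    """Projection estimator: sum_i h_i (Y_i^2 - eps^2 sigma_i^2)."""
    return sum(hi * (yi ** 2 - eps ** 2 * i ** (2 * gamma))
               for i, (hi, yi) in enumerate(zip(h, y), start=1))

# Illustrative (assumed) parameter values.
alpha, gamma, L, eps, n = 2.0, 0.5, 1.0, 0.01, 200
W = theorem1_bandwidth(eps, alpha, gamma, L)
h = weights(W, alpha, n)

# Hypothetical test signal, scaled so that sum_i i^(2 alpha) theta_i^2 = L.
scale = math.sqrt(L / sum(i ** -2.0 for i in range(1, n + 1)))
theta = [scale * i ** (-alpha - 1) for i in range(1, n + 1)]
truth = sum(t * t for t in theta)

estimate = q_hat(simulate(theta, eps, gamma, random.Random(0)), h, eps, gamma)
print(f"W = {W:.3f}, estimate = {estimate:.4f}, Q(theta) = {truth:.4f}")
```

With these values $\alpha > \gamma + 1/4$, so the example sits in the regular (parametric) regime; choosing $\alpha < \gamma + 1/4$ instead moves it to the irregular one.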

In the regular case (when the underlying signal is smoother than the 'difficulty' of the operator $A$), Theorem 1 actually says two things. One of them is that, for each $\theta$ in the set $\Sigma$, the quadratic risk of our estimator is of parametric rate and attains the efficiency constant in our model:

$$E\big[ (\hat{Q} - Q(\theta))^2 \big] = 4 \varepsilon^2 \sum_{i \ge 1} \sigma_i^2 \theta_i^2 \, (1 + o(1)),$$

as $\varepsilon \to 0$. Secondly, the quadratic risk is decomposed and the second order risk is optimized for our choice of parameters and equals the risk in the nonparametric case.

Note also that the rates are not surprising when compared to the results of Butucea [1] for the convolution density model. No second order terms were evaluated there, nor constants associated with the nonparametric rate. The efficiency constant is naturally different for the density model.

Let us now consider the case of super-smooth signal: $\theta \in \Sigma(\beta, r, L)$, where $a_i = \exp(\beta i^r)$.

Theorem 2 Let observations $Y_1, \dots, Y_n, \dots$ satisfy model (1). Let the estimator $\hat{Q}$ in (3) be defined with parameters $\{h_i\}_{i \ge 1}$ given by

$$h_i = \Big( 1 - \frac{e^{2\beta i^r}}{e^{2\beta W^r}} \Big)_+$$

and $W$ solution of the equation

$$W^{4\gamma + (1 - r)_+} \exp\big( 4\beta W^r - 2\beta r W^{r-1} I(r > 1) \big) = c(\beta, r, \gamma, L)\, \varepsilon^{-4},$$

with the constant $c := c(\beta, r, \gamma, L) = 2\beta r L^2$ if $0 < r < 1$, $c = L^2 (e^{4\beta} - 1)/(2 e^{2\beta})$ if $r = 1$, $c = L^2/2$ if $1 < r < 2$ and $c = L^2/(2 e^{2\beta})$ if $r = 2$. Then

$$\sup_{\theta \in \Sigma(\beta, r, L)} \Big\{ E\big[ (\hat{Q} - Q(\theta))^2 \big] - 4 \varepsilon^2 \sum_{i \ge 1} \sigma_i^2 \theta_i^2 \Big\} \le \frac{2 \varepsilon^4}{4\gamma + 1} \Big( \frac{\log(1/\varepsilon)}{\beta} \Big)^{(4\gamma + 1)/r} (1 + o(1)).$$

We note that in this case the signal is always smoother than the difficulty of the inverse problem, so there is always a parametric rate term in the quadratic risk. Our estimator also optimizes the upper bounds for the second order term in the quadratic risk. In this last term, the bias term is always smaller than the variance term for super-smooth signals.
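Since $W$ in Theorem 2 is only defined implicitly, a numerical sketch may help. The code below solves the equation on the log scale by bisection. The function names and the example values are assumptions made for illustration; the constants for $r = 1$ and $r = 2$ are taken as $L^2 (e^{4\beta} - 1)/(2 e^{2\beta})$ and $L^2/(2 e^{2\beta})$, which is how the statement is read here. The bisection assumes $\varepsilon$ is small enough that the root lies above $W = 1$, where the left-hand side is increasing.

```python
import math

def thm2_constant(beta, r, L):
    """Constant c(beta, r, gamma, L) of Theorem 2 (r = 1, 2 cases as read here)."""
    if r < 1:
        return 2 * beta * r * L ** 2
    if r == 1:
        return L ** 2 * (math.exp(4 * beta) - 1) / (2 * math.exp(2 * beta))
    if r < 2:
        return L ** 2 / 2
    return L ** 2 / (2 * math.exp(2 * beta))

def log_lhs(W, beta, r, gamma):
    """Logarithm of W^(4 gamma + (1-r)_+) exp(4 beta W^r - 2 beta r W^(r-1) 1{r>1})."""
    power = 4 * gamma + max(0.0, 1 - r)
    correction = 2 * beta * r * W ** (r - 1) if r > 1 else 0.0
    return power * math.log(W) + 4 * beta * W ** r - correction

def solve_W(eps, beta, r, gamma, L):
    """Bisection for the root of log_lhs(W) = log(c) - 4 log(eps) on [1, infinity)."""
    target = math.log(thm2_constant(beta, r, L)) - 4 * math.log(eps)
    lo, hi = 1.0, 2.0
    while log_lhs(hi, beta, r, gamma) < target:  # bracket the root
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if log_lhs(mid, beta, r, gamma) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Illustrative (assumed) values: eps = 1e-3, beta = 1, r = 1, gamma = 0.5, L = 1.
W = solve_W(1e-3, 1.0, 1.0, 0.5, 1.0)
first_order = (math.log(1 / 1e-3) / 1.0) ** (1 / 1.0)
print(f"W = {W:.4f}, first-order value (log(1/eps)/beta)^(1/r) = {first_order:.4f}")
```

The first-order value $(\log(1/\varepsilon)/\beta)^{1/r}$ printed alongside is the one that appears in the risk bound of Theorem 2; at moderate $\varepsilon$ the exact root is noticeably smaller because of the polynomial factor.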



3 Proofs 

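Proof of Theorem 1. We decompose as usual the quadratic risk $E[(\hat{Q} - Q(\theta))^2]$ into squared bias plus variance. The bias term can be written

$$\big( E[\hat{Q}] - Q(\theta) \big)^2 = \Big( \sum_{i=1}^{\infty} h_i E\big[ Y_i^2 - \varepsilon^2 \sigma_i^2 \big] - \sum_{i=1}^{\infty} \theta_i^2 \Big)^2 = \Big( \sum_{i=1}^{\infty} (1 - h_i) \theta_i^2 \Big)^2. \tag{5}$$

The variance term is decomposed as follows:

$$E\big[ (\hat{Q} - E[\hat{Q}])^2 \big] = E\Big[ \Big( \sum_{i=1}^{\infty} h_i \big( Y_i^2 - \varepsilon^2 \sigma_i^2 - \theta_i^2 \big) \Big)^2 \Big] = E\Big[ \Big( \sum_{i=1}^{\infty} h_i \big( 2 \varepsilon \theta_i \xi_i + \varepsilon^2 \xi_i^2 - \varepsilon^2 \sigma_i^2 \big) \Big)^2 \Big].$$

Since $Y_i$ are independent and $\xi_i$ are independent Gaussian random variables:

$$E\big[ (\hat{Q} - E[\hat{Q}])^2 \big] = \sum_{i=1}^{\infty} h_i^2 E\big[ (Y_i^2 - \varepsilon^2 \sigma_i^2 - \theta_i^2)^2 \big] = \sum_{i=1}^{\infty} h_i^2 \big( \varepsilon^4 E[\xi_i^4] - 2 \varepsilon^4 \sigma_i^2 E[\xi_i^2] + 4 \varepsilon^2 \theta_i^2 E[\xi_i^2] + \varepsilon^4 \sigma_i^4 \big).$$

Now, use the facts that $E[\xi_i^2] = \sigma_i^2$ and $E[\xi_i^4] = 3 \sigma_i^4$ to get

$$E\big[ (\hat{Q} - E[\hat{Q}])^2 \big] = 2 \varepsilon^4 \sum_{i=1}^{\infty} h_i^2 \sigma_i^4 + 4 \varepsilon^2 \sum_{i=1}^{\infty} h_i^2 \sigma_i^2 \theta_i^2 = 4 \varepsilon^2 \sum_{i=1}^{\infty} \sigma_i^2 \theta_i^2 - 4 \varepsilon^2 \sum_{i=1}^{\infty} (1 - h_i^2) \sigma_i^2 \theta_i^2 + 2 \varepsilon^4 \sum_{i=1}^{\infty} h_i^2 \sigma_i^4. \tag{6}$$

Thus by (5) and (6) we get

$$E\big[ (\hat{Q} - Q(\theta))^2 \big] = A_0(h, \theta) + A_1(h) + A_2(\theta) - A_3(h, \theta), \tag{7}$$

where

$$A_0(h, \theta) = A_0 := \Big( \sum_{i=1}^{\infty} (1 - h_i) \theta_i^2 \Big)^2, \qquad A_1(h) = A_1 := 2 \varepsilon^4 \sum_{i=1}^{\infty} h_i^2 \sigma_i^4,$$

$$A_2(\theta) = A_2 := 4 \varepsilon^2 \sum_{i=1}^{\infty} \sigma_i^2 \theta_i^2, \qquad A_3(h, \theta) = A_3 := 4 \varepsilon^2 \sum_{i=1}^{\infty} (1 - h_i^2) \sigma_i^2 \theta_i^2. \tag{8}$$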
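The decomposition of the quadratic risk into $A_0 + A_1 + A_2 - A_3$ can be checked numerically on a finite sequence. The sketch below computes the four terms in closed form and compares their combination with a Monte Carlo estimate of $E[(\hat{Q} - Q(\theta))^2]$. The signal $\theta_i = i^{-2}$, the weights $h_i = (1 - (i/8)^2)_+$, the truncation at 20 coordinates and the function names are arbitrary illustrative assumptions.

```python
import random

def risk_terms(theta, h, eps, gamma):
    """Closed-form terms A0, A1, A2, A3 of the risk decomposition."""
    a0 = sum((1 - hi) * t * t for hi, t in zip(h, theta)) ** 2
    a1 = a2 = a3 = 0.0
    for i, (hi, t) in enumerate(zip(h, theta), start=1):
        s2 = i ** (2 * gamma)                       # sigma_i^2
        a1 += 2 * eps ** 4 * hi ** 2 * s2 ** 2
        a2 += 4 * eps ** 2 * s2 * t * t
        a3 += 4 * eps ** 2 * (1 - hi ** 2) * s2 * t * t
    return a0, a1, a2, a3

def monte_carlo_risk(theta, h, eps, gamma, n_rep, rng):
    """Average of (Q_hat - Q(theta))^2 over n_rep simulated samples."""
    q = sum(t * t for t in theta)
    sigma = [i ** gamma for i in range(1, len(theta) + 1)]
    total = 0.0
    for _ in range(n_rep):
        est = 0.0
        for hi, t, s in zip(h, theta, sigma):
            y = t + eps * rng.gauss(0.0, s)         # Y_i = theta_i + eps * xi_i
            est += hi * (y * y - eps ** 2 * s * s)
        total += (est - q) ** 2
    return total / n_rep

# Illustrative (assumed) finite example.
n, eps, gamma = 20, 0.1, 0.5
theta = [i ** -2.0 for i in range(1, n + 1)]
h = [max(0.0, 1 - (i / 8) ** 2) for i in range(1, n + 1)]

a0, a1, a2, a3 = risk_terms(theta, h, eps, gamma)
exact = a0 + a1 + a2 - a3
mc = monte_carlo_risk(theta, h, eps, gamma, 20_000, random.Random(1))
print(f"A0 + A1 + A2 - A3 = {exact:.5f}, Monte Carlo risk = {mc:.5f}")
```

The two printed numbers should agree up to Monte Carlo error, which is of the order of one percent for this number of replications.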

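If we denote $T(h, \theta) := A_0(h, \theta) + A_1(h) = A_0 + A_1$, then we want to find

$$\inf_h \sup_{\theta \in \Sigma} T(h, \theta) \le \sup_{\theta \in \Sigma} T(h, \theta) \le \sup_{\theta \in \partial\Sigma} T(h, \theta),$$

where the infimum is taken with respect to all sequences $h$ such that $0 \le h_i \le 1$ for all $i \ge 1$ and with

$$\partial\Sigma = \Big\{ \theta : \sum_{i=1}^{\infty} a_i^2 \theta_i^2 = L \Big\}. \tag{9}$$

Let us define $F(h, \theta) = T(h, \theta) - \kappa \big( \sum_{i=1}^{\infty} a_i^2 \theta_i^2 - L \big)$ with $\kappa > 0$. Then for all $j \in \mathbb{N}^*$ the optimal $h$ and $\theta$ have to verify

$$\frac{\partial F}{\partial h_j}(h, \theta) = 0 \quad \text{and} \quad \frac{\partial F}{\partial \theta_j}(h, \theta) = 0.$$

We get

$$h_j = \big( 1 - K a_j^2 \big)_+ \quad \text{and} \quad \theta_j^2 = \frac{2 \varepsilon^4 \sigma_j^4 h_j}{\sum_{i \ge 1} (1 - h_i) \theta_i^2}, \tag{10}$$

where $K > 0$. Let us write $h_j = \big( 1 - (j/W)^{2\alpha} \big)_+$, where $W \to \infty$ when $\varepsilon \to 0$.

Recall that $\Sigma = \Sigma(\alpha, L) = \{ \theta : \sum_{i=1}^{\infty} i^{2\alpha} \theta_i^2 \le L \}$; then for $\theta^* \in \partial\Sigma(\alpha, L)$ we can write both

$$\sum_{i=1}^{\infty} (1 - h_i) (\theta_i^*)^2 = \sum_{i=1}^{W} \Big( \frac{i}{W} \Big)^{2\alpha} (\theta_i^*)^2 + \sum_{i > W} (\theta_i^*)^2 \le \frac{1}{W^{2\alpha}} \Big( \sum_{i=1}^{W} i^{2\alpha} (\theta_i^*)^2 + \sum_{i > W} i^{2\alpha} (\theta_i^*)^2 \Big) = \frac{L}{W^{2\alpha}}$$

and

$$\sum_{i=1}^{\infty} (1 - h_i) (\theta_i^*)^2 \ge \sum_{i=1}^{W} \Big( \frac{i}{W} \Big)^{2\alpha} (\theta_i^*)^2 = \frac{1}{W^{2\alpha}} \Big( \sum_{i=1}^{\infty} i^{2\alpha} (\theta_i^*)^2 - \sum_{i > W} i^{2\alpha} (\theta_i^*)^2 \Big).$$

Therefore $A_0 = L^2 W^{-4\alpha} (1 + o(1))$, as $\varepsilon \to 0$. This means also that we can write

$$(\theta_i^*)^2 = \frac{2 \varepsilon^4 \sigma_i^4 W^{2\alpha}}{L} \Big( 1 - \Big( \frac{i}{W} \Big)^{2\alpha} \Big)_+.$$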



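Let us now compute the optimal $W$, using again the fact that $\theta^* \in \partial\Sigma(\alpha, L)$, which is equivalent to

$$\sum_{i=1}^{\infty} i^{2\alpha} (\theta_i^*)^2 = L.$$

This is further equivalent to

$$\frac{2 \varepsilon^4 W^{2\alpha}}{L} \sum_{i=1}^{W} i^{4\gamma + 2\alpha} \Big( 1 - \Big( \frac{i}{W} \Big)^{2\alpha} \Big) = L,$$

giving

$$W^{4\alpha + 4\gamma + 1} \, \frac{4\alpha}{(4\gamma + 4\alpha + 1)(4\gamma + 2\alpha + 1)} (1 + o(1)) = \frac{L^2}{\varepsilon^4}.$$

Therefore

$$W = \big( L^2 B(\alpha, \gamma) \big)^{\frac{1}{4\alpha + 4\gamma + 1}} \varepsilon^{-\frac{4}{4\alpha + 4\gamma + 1}} (1 + o(1)), \tag{11}$$

where $B(\alpha, \gamma) := \frac{(4\gamma + 4\alpha + 1)(4\gamma + 2\alpha + 1)}{4\alpha}$; we will take $W$ to be the integer part of the dominant term. From now on, we denote $B := B(\alpha, \gamma)$.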

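We have to evaluate the terms defined in (8). For $\alpha < \gamma + 1/4$, we have

$$A_0 = \Big( \sum_{i=1}^{\infty} (1 - h_i) (\theta_i^*)^2 \Big)^2 = L^2 W^{-4\alpha} (1 + o(1)) = L^{\frac{2(4\gamma + 1)}{4\gamma + 4\alpha + 1}} B^{-\frac{4\alpha}{4\gamma + 4\alpha + 1}} \varepsilon^{\frac{16\alpha}{4\gamma + 4\alpha + 1}} (1 + o(1)),$$

$$A_1 = 2 \varepsilon^4 \sum_{i=1}^{W} \sigma_i^4 h_i^2 = 2 \varepsilon^4 \sum_{i=1}^{W} i^{4\gamma} \Big( 1 - \Big( \frac{i}{W} \Big)^{2\alpha} \Big)^2 = \frac{16 \alpha^2 \varepsilon^4 W^{4\gamma + 1}}{(4\gamma + 1)(4\gamma + 4\alpha + 1)(4\gamma + 2\alpha + 1)} (1 + o(1))$$

$$= \frac{4\alpha}{4\gamma + 1} L^{\frac{2(4\gamma + 1)}{4\gamma + 4\alpha + 1}} B^{-\frac{4\alpha}{4\gamma + 4\alpha + 1}} \varepsilon^{\frac{16\alpha}{4\gamma + 4\alpha + 1}} (1 + o(1)),$$

$$A_2 = 4 \varepsilon^2 \sum_{i=1}^{\infty} \sigma_i^2 (\theta_i^*)^2 = \frac{8 \varepsilon^6 W^{2\alpha}}{L} \sum_{i=1}^{W} i^{6\gamma} \Big( 1 - \Big( \frac{i}{W} \Big)^{2\alpha} \Big) = \frac{8 \varepsilon^6 W^{6\gamma + 2\alpha + 1}}{L} \cdot \frac{2\alpha}{(6\gamma + 1)(6\gamma + 2\alpha + 1)} (1 + o(1))$$

$$= O(1)\, \varepsilon^{\frac{16\alpha + 2}{4\alpha + 4\gamma + 1}} (1 + o(1)) = o(1) A_1,$$

as $\varepsilon \to 0$. As $h_i \in [0, 1]$ for all $i \in \mathbb{N}$, the term $A_3 = 4 \varepsilon^2 \sum_{i=1}^{\infty} (1 - h_i^2) \sigma_i^2 \theta_i^2 \le A_2$. Then the quadratic risk is such that

$$E\big[ (\hat{Q} - Q(\theta))^2 \big] = (A_0 + A_1)(1 + o(1)) = L^{\frac{2(4\gamma + 1)}{4\gamma + 4\alpha + 1}} B^{-\frac{4\alpha}{4\gamma + 4\alpha + 1}} \, \frac{4\gamma + 4\alpha + 1}{4\gamma + 1} \, \varepsilon^{\frac{16\alpha}{4\gamma + 4\alpha + 1}} (1 + o(1)),$$

as $\varepsilon \to 0$, and this explains the constant $C(\alpha, \gamma, L)$ in (4).

Let us note that if $\alpha > \gamma + 1/4$, we can estimate the quadratic functional at the parametric rate, as $A_2$ is the dominant term in the risk and is of order $\varepsilon^2$. More precisely

$$E\big[ (\hat{Q} - Q(\theta))^2 \big] = 4 \varepsilon^2 \sum_{i=1}^{\infty} \sigma_i^2 \theta_i^2 (1 + o(1)) = A_2 (1 + o(1)),$$

as $\varepsilon \to 0$. Indeed, it is easy to see that in this case

$$A_0 + A_1 = C(\alpha, \gamma, L)\, \varepsilon^{\frac{16\alpha}{4\alpha + 4\gamma + 1}} (1 + o(1)) = o(A_2)$$

and, moreover, using $1 - h_i^2 \le 2 (i/W)^{2\alpha}$ for $i \le W$ and $i^{2\gamma} \le W^{2(\gamma - \alpha)} i^{2\alpha}$ for $i > W$,

$$A_3 = 4 \varepsilon^2 \sum_{i=1}^{W} (1 - h_i^2) \sigma_i^2 \theta_i^2 + 4 \varepsilon^2 \sum_{i > W} \sigma_i^2 \theta_i^2 \le 8 \varepsilon^2 W^{2(\gamma - \alpha)} \sum_{i=1}^{W} \Big( \frac{i}{W} \Big)^{2\gamma} i^{2\alpha} \theta_i^2 + 4 \varepsilon^2 W^{2(\gamma - \alpha)} \sum_{i > W} i^{2\alpha} \theta_i^2$$

$$\le 8 \varepsilon^2 W^{2(\gamma - \alpha)} L = O(1)\, \varepsilon^{\frac{16\alpha + 2}{4\alpha + 4\gamma + 1}} = o(A_0 + A_1),$$

as $\varepsilon \to 0$. ■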
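The asymptotics used in the proof of Theorem 1 can be checked numerically. The sketch below evaluates $A_0$ and $A_1$ by direct summation at the worst-case signal $\theta^*$ and compares $A_0 + A_1$ with $C(\alpha, \gamma, L)\, \varepsilon^{16\alpha/(4\alpha + 4\gamma + 1)}$. The values $\alpha = 0.5$, $\gamma = 1$, $L = 1$, $\varepsilon = 10^{-6}$ and the function names are illustrative assumptions; $C$ is written in the equivalent form $\frac{4\alpha + 4\gamma + 1}{4\gamma + 1} L^{2(4\gamma+1)/(4\alpha+4\gamma+1)} B^{-4\alpha/(4\alpha+4\gamma+1)}$.

```python
import math

def B_const(alpha, gamma):
    """B(alpha, gamma) = (4g + 4a + 1)(4g + 2a + 1) / (4a)."""
    return (4 * gamma + 4 * alpha + 1) * (4 * gamma + 2 * alpha + 1) / (4 * alpha)

def C_const(alpha, gamma, L):
    """Constant C(alpha, gamma, L), written through B."""
    d = 4 * alpha + 4 * gamma + 1
    return (d / (4 * gamma + 1)) * L ** (2 * (4 * gamma + 1) / d) \
        * B_const(alpha, gamma) ** (-4 * alpha / d)

def bias_and_variance_terms(eps, alpha, gamma, L):
    """A0 and A1 by direct summation at the worst-case signal theta*."""
    d = 4 * alpha + 4 * gamma + 1
    W = int((L ** 2 * B_const(alpha, gamma)) ** (1 / d) * eps ** (-4 / d))
    s0 = 0.0  # sum_i (1 - h_i) (theta*_i)^2
    s1 = 0.0  # sum_i h_i^2 sigma_i^4
    for i in range(1, W + 1):
        x = (i / W) ** (2 * alpha)
        h = 1 - x
        sigma4 = i ** (4 * gamma)
        theta2 = 2 * eps ** 4 * W ** (2 * alpha) * sigma4 * h / L
        s0 += x * theta2
        s1 += h * h * sigma4
    return s0 ** 2, 2 * eps ** 4 * s1

# Illustrative (assumed) values in the irregular regime alpha < gamma + 1/4.
alpha, gamma, L, eps = 0.5, 1.0, 1.0, 1e-6
a0, a1 = bias_and_variance_terms(eps, alpha, gamma, L)
predicted = C_const(alpha, gamma, L) * eps ** (16 * alpha / (4 * alpha + 4 * gamma + 1))
print(f"A0 + A1 = {a0 + a1:.4e}, C * rate = {predicted:.4e}, A1/A0 = {a1 / a0:.4f}")
```

For these values the ratio $A_1/A_0$ should be close to $4\alpha/(4\gamma + 1) = 0.4$, which is the relation that produces the factor $(4\gamma + 4\alpha + 1)/(4\gamma + 1)$ in the constant.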

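Proof of Theorem 2. We follow the lines of the proof of Theorem 1. In this case, there is always a parametric term and we do the computations of the second order term in the quadratic risk.

We solve the same optimisation problem and find

$$h_i = \Big( 1 - \frac{e^{2\beta i^r}}{e^{2\beta W^r}} \Big)_+.$$

Then for $\theta^* \in \partial\Sigma(\beta, L, r)$ we get

$$\sum_{i=1}^{\infty} (1 - h_i) (\theta_i^*)^2 = \sum_{i=1}^{W} \frac{e^{2\beta i^r}}{e^{2\beta W^r}} (\theta_i^*)^2 + \sum_{i > W} (\theta_i^*)^2 \le \frac{1}{e^{2\beta W^r}} \Big( \sum_{i=1}^{W} e^{2\beta i^r} (\theta_i^*)^2 + \sum_{i > W} e^{2\beta i^r} (\theta_i^*)^2 \Big) = \frac{L}{e^{2\beta W^r}}$$

and

$$\sum_{i=1}^{\infty} (1 - h_i) (\theta_i^*)^2 \ge \sum_{i=1}^{W} \frac{e^{2\beta i^r}}{e^{2\beta W^r}} (\theta_i^*)^2 = \frac{1}{e^{2\beta W^r}} \Big( \sum_{i=1}^{\infty} e^{2\beta i^r} (\theta_i^*)^2 - \sum_{i > W} e^{2\beta i^r} (\theta_i^*)^2 \Big). \tag{12}$$

Therefore

$$A_0 = L^2 e^{-4\beta W^r} (1 + o(1)), \quad \text{as } \varepsilon \to 0.$$

As in the proof of Theorem 1, this gives $(\theta_i^*)^2 = \frac{2 \varepsilon^4 \sigma_i^4}{L} \big( e^{2\beta W^r} - e^{2\beta i^r} \big)_+$.

To compute the optimal $W$, we also use the fact that $\theta^* \in \partial\Sigma(\beta, L, r)$:

$$\sum_{i=1}^{\infty} e^{2\beta i^r} (\theta_i^*)^2 = L, \quad \text{i.e.} \quad 2 \varepsilon^4 \Big( e^{2\beta W^r} \sum_{i=1}^{W} i^{4\gamma} e^{2\beta i^r} - \sum_{i=1}^{W} i^{4\gamma} e^{4\beta i^r} \Big) = L^2.$$

By using Lemmata 1 and 2 we have $W$ solution of the following equation

$$W^{4\gamma} e^{4\beta W^r - 2\beta r W^{r-1}} = c\, \varepsilon^{-4} \quad \text{if } 1 < r \le 2,$$
$$W^{4\gamma} e^{4\beta W} = c\, \varepsilon^{-4} \quad \text{if } r = 1, \tag{13}$$
$$W^{4\gamma - r + 1} e^{4\beta W^r} = c\, \varepsilon^{-4} \quad \text{if } 0 < r < 1,$$

as $\varepsilon \to 0$, with the constant $c = c(\beta, r, \gamma, L)$ defined in Theorem 2.

We evaluate $A_0 + A_1$: in each of the previous cases, the bias term $A_0$ is infinitely smaller than the variance term $A_1$ and the main term in $A_1$ can be given for

$$W = \Big( \frac{\log(1/\varepsilon)}{\beta} \Big)^{1/r}.$$

Indeed, by using Lemmata 1 and 2,

$$A_1 = 2 \varepsilon^4 \sum_{i=1}^{W} \sigma_i^4 h_i^2 = 2 \varepsilon^4 \sum_{i=1}^{W} i^{4\gamma} \Big( 1 - \frac{e^{2\beta i^r}}{e^{2\beta W^r}} \Big)^2 = \frac{2 \varepsilon^4 W^{4\gamma + 1}}{4\gamma + 1} (1 + o(1)) = \frac{2 \varepsilon^4}{4\gamma + 1} \Big( \frac{\log(1/\varepsilon)}{\beta} \Big)^{(4\gamma + 1)/r} (1 + o(1)) = o(A_2).$$

As $A_0 = o(A_1)$ it is easy to see that in this case

$$A_0 + A_1 = \frac{2 \varepsilon^4}{4\gamma + 1} \Big( \frac{\log(1/\varepsilon)}{\beta} \Big)^{(4\gamma + 1)/r} (1 + o(1)) = o(A_2)$$

as $\varepsilon \to 0$.

The last thing to check is that $A_3 = o(A_0 + A_1)$ as $\varepsilon \to 0$:

$$A_3 = 4 \varepsilon^2 \sum_{i=1}^{W} (1 - h_i^2) \sigma_i^2 \theta_i^2 + 4 \varepsilon^2 \sum_{i > W} \sigma_i^2 \theta_i^2 \le 8 \varepsilon^2 \frac{W^{2\gamma}}{e^{2\beta W^r}} \sum_{i=1}^{W} e^{2\beta i^r} \theta_i^2 + 4 \varepsilon^2 \frac{W^{2\gamma}}{e^{2\beta W^r}} \sum_{i > W} e^{2\beta i^r} \theta_i^2 \le 8 \varepsilon^2 \frac{W^{2\gamma}}{e^{2\beta W^r}} L = O(1)\, W^{2\gamma} e^{-2\beta W^r} \varepsilon^2.$$

So, we can write that

$$A_3 = O(A_1)\, \frac{1}{W^{2\gamma + 1} e^{2\beta W^r} \varepsilon^2}.$$

By (13), we easily see that

$$W^{2\gamma + 1} e^{2\beta W^r} \varepsilon^2 = \sqrt{c}\, W e^{\beta r W^{r-1}} \quad \text{if } 1 < r \le 2,$$
$$W^{2\gamma + 1} e^{2\beta W} \varepsilon^2 = \sqrt{c}\, W \quad \text{if } r = 1,$$
$$W^{2\gamma + 1} e^{2\beta W^r} \varepsilon^2 = \sqrt{c}\, W^{(1 + r)/2} \quad \text{if } 0 < r < 1.$$

Then, as $W \to \infty$, we get for all $r \in\, ]0, 2]$, $A_3 = o(A_1)$ as $\varepsilon \to 0$. ■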



4 Appendix 



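Lemma 1 For all $a, b, s > 0$ and $v > 0$,

$$\int_0^v x^a e^{b x^s} \, dx = \frac{1}{bs} v^{a + 1 - s} e^{b v^s} (1 + o(1)),$$

as $v \to \infty$.

Lemma 2 For $a > 0$, $b > 0$, and $r > 0$, as $N \to \infty$,

$$\sum_{i=1}^{N} i^a e^{b i^r} = N^a e^{b N^r} (1 + o(1)) \quad \text{if } r > 1,$$
$$\sum_{i=1}^{N} i^a e^{b i} = \frac{N^a e^{b(N + 1)}}{e^b - 1} (1 + o(1)) \quad \text{if } r = 1,$$
$$\sum_{i=1}^{N} i^a e^{b i^r} = \frac{1}{br} N^{a + 1 - r} e^{b N^r} (1 + o(1)) \quad \text{if } 0 < r < 1.$$

Proof of Lemma 2. • When $r > 1$,

$$N^a e^{b N^r} \le \sum_{i=1}^{N} i^a e^{b i^r} = N^a e^{b N^r} + \sum_{i=1}^{N-1} i^a e^{b i^r} \quad \text{and} \quad \sum_{i=1}^{N-1} i^a e^{b i^r} \le (N - 1)^{a + 1} e^{b (N - 1)^r} = o\big( N^a e^{b N^r} \big),$$

as $N \to \infty$.

• When $0 < r < 1$, use Lemma 1 and the fact that

$$\int_0^N x^a e^{b x^r} \, dx \le \sum_{i=1}^{N} i^a e^{b i^r} \le \int_1^{N + 1} x^a e^{b x^r} \, dx = \int_0^N x^a e^{b x^r} \, dx \, (1 + o(1)).$$

• When $r = 1$ we write both

$$S_N := \sum_{i=1}^{N} i^a e^{b i} = N^a e^{b N} + \sum_{i=1}^{N-1} i^a e^{b i}$$

and

$$\sum_{i=1}^{N} i^a e^{b i} = \sum_{i=0}^{N-1} (i + 1)^a e^{b(i + 1)} = e^b + e^b \sum_{i=1}^{N-1} (i + 1)^a e^{b i}.$$

As the sums $\sum_{i=1}^{N-1} i^a e^{b i}$ and $\sum_{i=1}^{N-1} (i + 1)^a e^{b i}$ have equivalent general terms and diverge, they are equivalent to $S_{N-1}$, say. We get that, for large $N$,

$$S_N = \frac{N^a e^{b(N + 1)}}{e^b - 1} (1 + o(1)).$$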
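As a quick sanity check of Lemma 2, the following sketch compares the partial sums $\sum_{i \le N} i^a e^{b i^r}$ with the stated equivalents for one value of $r$ in each regime. The values of $a$, $b$, $N$ and the function names are illustrative assumptions; the $r = 1$ equivalent is the one obtained at the end of the proof, $N^a e^{b(N+1)}/(e^b - 1)$.

```python
import math

def weighted_exp_sum(N, a, b, r):
    """Direct evaluation of sum_{i=1}^N i^a exp(b i^r)."""
    return sum(i ** a * math.exp(b * i ** r) for i in range(1, N + 1))

def lemma2_equivalent(N, a, b, r):
    """Asymptotic equivalent of the sum, by regime of r."""
    if r > 1:
        return N ** a * math.exp(b * N ** r)
    if r == 1:
        return N ** a * math.exp(b * (N + 1)) / (math.exp(b) - 1)
    return N ** (a + 1 - r) * math.exp(b * N ** r) / (b * r)

# One illustrative (assumed) configuration per regime, with a = b = 1.
ratios = {}
for N, r in [(20, 2.0), (50, 1.0), (10_000, 0.5)]:
    ratios[r] = weighted_exp_sum(N, 1.0, 1.0, r) / lemma2_equivalent(N, 1.0, 1.0, r)
    print(f"r = {r}: N = {N}, sum / equivalent = {ratios[r]:.4f}")
```

The ratios tend to 1 as $N$ grows; convergence is slowest for $0 < r < 1$, where the relative error decays only like $N^{-r}$.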